neural information processing system 2022
LLM Embeddings for Deep Learning on Tabular Data
Koloski, Boshko, Margeloiu, Andrei, Jiang, Xiangjian, Škrlj, Blaž, Simidjievski, Nikola, Jamnik, Mateja
Tabular deep-learning methods require embedding numerical and categorical input features into high-dimensional spaces before processing them. Existing methods deal with this heterogeneous nature of tabular data by employing separate type-specific encoding approaches. This limits the cross-table transfer potential and the exploitation of pre-trained knowledge. We propose a novel approach that first transforms tabular data into text, and then leverages pre-trained representations from LLMs to encode this data, resulting in a plug-and-play solution for improving deep-learning tabular methods. We demonstrate that our approach improves accuracy over competitive models, such as MLP, ResNet and FT-Transformer, by validating on seven classification datasets.
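The text-then-encode idea in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the "<name> is <value>" template, the function names, and above all `toy_text_encoder`, which is a hashed bag-of-words stand-in so the example runs without any model download. In practice that function would be replaced by a call to a pre-trained LLM encoder.

```python
import hashlib
from typing import Callable, Dict, List, Union

Value = Union[int, float, str]


def serialize_feature(name: str, value: Value) -> str:
    """Turn one (column, value) pair into text (assumed template, not the paper's)."""
    return f"{name} is {value}"


def toy_text_encoder(text: str, dim: int = 8) -> List[float]:
    """Hypothetical stand-in for a pre-trained LLM encoder.

    Hashes each whitespace-separated token into one of `dim` buckets and
    counts occurrences. It carries no pre-trained knowledge; it only keeps
    the example self-contained.
    """
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[digest[0] % dim] += 1.0
    return vec


def embed_row(
    row: Dict[str, Value],
    encoder: Callable[[str], List[float]] = toy_text_encoder,
) -> List[List[float]]:
    """Encode every feature with the same text encoder, whatever its type."""
    return [encoder(serialize_feature(name, value)) for name, value in row.items()]


# Numerical and categorical features go through one shared encoding path.
row = {"age": 42, "occupation": "teacher", "hours_per_week": 37.5}
embeddings = embed_row(row)
```

The point of the sketch is the design choice the abstract describes: a single text-based encoder replaces separate type-specific encoders, so the resulting per-feature vectors could be fed to a downstream model such as an MLP or a transformer.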
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.05)
- Europe > Middle East > Cyprus (0.04)
- Research Report > Promising Solution (0.48)
- Research Report > New Finding (0.46)
- Overview > Innovation (0.34)
- Health & Medicine > Therapeutic Area (0.73)
- Education > Educational Setting (0.68)
A Survey on Data Markets
Zhang, Jiayao, Bi, Yuran, Cheng, Mengye, Liu, Jinfei, Ren, Kui, Sun, Qiheng, Wu, Yihang, Cao, Yang, Fernandez, Raul Castro, Xu, Haifeng, Jia, Ruoxi, Kwon, Yongchan, Pei, Jian, Wang, Jiachen T., Xia, Haocheng, Xiong, Li, Yu, Xiaohui, Zou, James
Data is the new oil of the 21st century. The growing trend of trading data for greater welfare has led to the emergence of data markets. A data market is any mechanism whereby the exchange of data products, including datasets and data derivatives, takes place as a result of data buyers and data sellers being in contact with one another, either directly or through mediating agents. It serves as a coordinating mechanism through which several functions, most importantly the pricing and distribution of data, interact so that the value of data is fully exploited and enhanced. In this article, we present a comprehensive survey of this important and emerging direction from the aspects of data search, data productization, data transaction, data pricing, revenue allocation as well as privacy, security, and trust issues. We also investigate the government policies and industry status of data markets across different countries and different domains. Finally, we identify the unresolved challenges and discuss possible future directions for the development of data markets.
- Overview (1.00)
- Research Report > Experimental Study (0.45)
- Research Report > New Finding (0.45)
- Telecommunications (1.00)
- Law > Statutes (1.00)
- Law > Civil Rights & Constitutional Law (1.00)
Survey of Cultural Awareness in Language Models: Text and Beyond
Pawar, Siddhesh, Park, Junyeong, Jin, Jiho, Arora, Arnav, Myung, Junho, Yadav, Srishti, Haznitrama, Faiz Ghifari, Song, Inhwa, Oh, Alice, Augenstein, Isabelle
Large-scale deployment of large language models (LLMs) in various applications, such as chatbots and virtual assistants, requires LLMs to be culturally sensitive to the user to ensure inclusivity. Culture has been widely studied in psychology and anthropology, and there has been a recent surge in research on making LLMs more culturally inclusive that goes beyond multilinguality and builds on findings from psychology and anthropology. In this paper, we survey efforts towards incorporating cultural awareness into text-based and multimodal LLMs. We start by defining cultural awareness in LLMs, taking the definitions of culture from anthropology and psychology as a point of departure. We then examine methodologies adopted for creating cross-cultural datasets, strategies for cultural inclusion in downstream tasks, and methodologies that have been used for benchmarking cultural awareness in LLMs. Further, we discuss the ethical implications of cultural alignment, the role of Human-Computer Interaction in driving cultural inclusion in LLMs, and the role of cultural alignment in driving social science research. We finally provide pointers to future research based on our findings about gaps in the literature.
- North America > United States > Washington > King County > Seattle (0.27)
- Asia > South Korea (0.14)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Research Report > New Finding (1.00)
- Overview (1.00)
- Leisure & Entertainment (1.00)
- Health & Medicine > Therapeutic Area (1.00)
- Education > Educational Setting > K-12 Education (1.00)